ON THE RATE OF CONVERGENCE AND
BERRY-ESSEEN TYPE THEOREMS FOR A
MULTIVARIATE FREE CENTRAL LIMIT THEOREM

Abstract. We address the question of a Berry-Esseen type theorem for
the speed of convergence in a multivariate free central limit theorem.
For this, we estimate the difference between the operator-valued Cauchy
transforms of the normalized partial sums in an operator-valued free
central limit theorem and the Cauchy transform of the limiting
operator-valued semicircular element.

1. Introduction

The free central limit theorem (due to Voiculescu [12] in the one- 
dimensional case, and to Speicher [10] in the multivariate case) is one of 
the basic results in free probability theory. Investigations on the speed
of convergence to the limiting semicircular distribution, however, were
taken up only recently. In the classical context, the analogous question
is answered by the famous Berry-Esseen theorem, which states, in its
simplest version, the following: If $X_i$ are i.i.d. random variables with
mean zero and variance 1, then the distance between
$S_n := (X_1 + \cdots + X_n)/\sqrt{n}$ and a normal variable $\gamma$ of
mean zero and variance 1 can be estimated in terms of the Kolmogorov
distance $\Delta$ by

\[ \Delta(S_n, \gamma) \le C \, \frac{\rho}{\sqrt{n}}, \]

where $C$ is a constant and $\rho$ is the absolute third moment of the
variables $X_i$.
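The classical estimate can be observed numerically. The following is a minimal sketch, assuming symmetric signs $X_i = \pm 1$ (so that the variance and the absolute third moment $\rho$ both equal 1); the function name is hypothetical and the choice of distribution is only for illustration.

```python
import math

def kolmogorov_distance_rademacher(n: int) -> float:
    """Kolmogorov distance between S_n = (X_1 + ... + X_n)/sqrt(n),
    with X_i = +1 or -1 with probability 1/2 each, and a standard normal."""
    def phi(x: float) -> float:
        # Standard normal distribution function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    dist = 0.0
    cdf_left = 0.0  # P(S_n < x) just before the current atom
    for k in range(n + 1):
        x = (2 * k - n) / math.sqrt(n)   # atom of S_n (k plus-signs)
        p = math.comb(n, k) / 2.0 ** n   # its probability
        cdf_right = cdf_left + p         # P(S_n <= x)
        # The distribution function of S_n is a step function, so the
        # supremum is attained at an atom, from the left or from the right.
        dist = max(dist, abs(cdf_left - phi(x)), abs(cdf_right - phi(x)))
        cdf_left = cdf_right
    return dist

for n in (10, 100, 1000):
    d = kolmogorov_distance_rademacher(n)
    print(n, d, d * math.sqrt(n))  # the third column stays bounded
```

The last column is the quantity that the Berry-Esseen theorem bounds by $C \rho$.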

The question for a free analogue of the Berry-Esseen estimate in the
case of one random variable was answered by Chistyakov and Götze [3]:
If $X_i$ are free identically distributed random variables with mean zero
and variance 1, then the distance between
$S_n := (X_1 + \cdots + X_n)/\sqrt{n}$ and a semicircular variable $s$ of
mean zero and variance 1 can, under the assumption of finite fourth
moment, be estimated as

\[ \Delta(S_n, s) \le c \, \frac{|m_3| + \sqrt{m_4}}{\sqrt{n}}, \]

where $c > 0$ is an absolute constant, and $m_3$ and $m_4$ are the third
and fourth moment, respectively, of the $X_i$. (Independently, the same
kind of question was considered, under the more restrictive assumption of
compact support for the $X_i$, by Kargin [8].)

In this paper we want to address the multivariate version of a free
Berry-Esseen theorem. In contrast to the classical situation, the
multivariate situation is of a quite different nature than the
one-dimensional case, because we have to deal with non-commuting
operators, and all the analytical tools which are available in the
one-dimensional case break down. However, we are able to deal with this
situation by invoking recent ideas of Haagerup and Thorbjørnsen [6, 5],
in particular their linearization trick, which allows one to reduce the
multivariate (scalar-valued) problem to an analogous one-dimensional
operator-valued problem. Estimates for the operator-valued Cauchy
transform of this operator-valued operator are quite similar to estimates
in the scalar-valued case. Actually, on the level of deriving equations
for these Cauchy transforms we can follow ideas which are used for
dealing with speed of convergence questions for random matrices; here we
are inspired in particular by the work of Götze and Tikhomirov [4], but
see also [1, 2]. Our main theorem on the speed of convergence in an
operator-valued free central limit theorem is the following.

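Theorem 1. Let $1 \in \mathcal{B} \subset \mathcal{A}$,
$E : \mathcal{A} \to \mathcal{B}$ be an operator-valued probability space.
Consider selfadjoint $X_1, X_2, \ldots \in \mathcal{A}$ which are free
with respect to $E$ and have identical $\mathcal{B}$-valued distribution.
Assume that the first moments vanish,

\[ E[X_i] = 0, \]

and let

\[ \eta : \mathcal{B} \to \mathcal{B}, \qquad \eta(b) = E[X_i b X_i] \]

be their covariance. Denote

\[ \alpha_2 := \sup_{b \in \mathcal{B},\ \|b\| = 1} \|E[X_i b X_i]\| = \|\eta\| \]

and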

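\[ \alpha_4 := \sup_{b \in \mathcal{B},\ \|b\| = 1} \|E[X_i b X_i X_i b^* X_i]\|. \]

Consider now the normalized sums

\[ S_n := \frac{X_1 + \cdots + X_n}{\sqrt{n}} \]

and their $\mathcal{B}$-valued Cauchy transforms

\[ G_n(b) := E\Big[ \frac{1}{b - S_n} \Big] \]

on the "upper half plane" $\mathcal{B}^+$ in $\mathcal{B}$,

\[ \mathcal{B}^+ := \{ b \in \mathcal{B} \mid \operatorname{Im} b \ge 0 \text{ and } \operatorname{Im} b \text{ invertible} \}. \]

By $G$ we denote the operator-valued Cauchy transform of a
$\mathcal{B}$-valued semicircular element with covariance $\eta$.

Then we have for all $b \in \mathcal{B}^+$ and all $n$ that

\[ \|G_n(b) - G(b)\| \le 4 c_n(b) \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big) \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\|^2, \tag{1} \]

where

\[ c_n(b) := \frac{1}{\sqrt{n}} \Big\| \frac{1}{\operatorname{Im} b} \Big\|^3 \sqrt{\alpha_2} \cdot \Big( 2\alpha_2 + \sqrt{\alpha_4 + 2\alpha_2^2} \Big) + \frac{1}{n} \Big\| \frac{1}{\operatorname{Im} b} \Big\|^4 \alpha_2^2. \]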
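To get a feeling for the size of the right-hand side of (1), one can evaluate it for given norms. The following is a minimal sketch; the function names and the sample values are hypothetical, and the inputs are the numbers $\|b\|$, $\|1/\operatorname{Im} b\|$, $\alpha_2$ and $\alpha_4$ rather than the operators themselves.

```python
import math

def c_n(n, inv_im_norm, alpha2, alpha4):
    """c_n(b) from Theorem 1; inv_im_norm stands for ||1/Im b||."""
    return (inv_im_norm ** 3 * math.sqrt(alpha2)
            * (2 * alpha2 + math.sqrt(alpha4 + 2 * alpha2 ** 2)) / math.sqrt(n)
            + inv_im_norm ** 4 * alpha2 ** 2 / n)

def theorem1_bound(n, b_norm, inv_im_norm, alpha2, alpha4):
    """Right-hand side of estimate (1)."""
    return (4 * c_n(n, inv_im_norm, alpha2, alpha4)
            * (b_norm + alpha2 * inv_im_norm) * inv_im_norm ** 2)

# Hypothetical sample values: ||b|| = 1, ||1/Im b|| = 1, alpha2 = 1, alpha4 = 2.
for n in (1, 100, 10000):
    print(n, theorem1_bound(n, 1.0, 1.0, 1.0, 2.0))
```

For fixed $b$ the leading term decays like $1/\sqrt{n}$, while the bound deteriorates polynomially in $\|1/\operatorname{Im} b\|$ as $b$ approaches the boundary of the upper half plane.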

In the one-dimensional scalar case one can derive from such estimates
corresponding estimates for the Kolmogorov distance between the
distribution of $S_n$ and the limiting semicircle $s$. This relies on the
fact that the Kolmogorov metric measures how close the distribution
functions of two measures are, and the Stieltjes inversion formula allows
one to relate the distribution function to the Cauchy transform. (In the
proof of the classical Berry-Esseen theorem one follows a similar route,
using Fourier transforms instead of Cauchy transforms.) For the
multivariate case, say of $d$ variables, where we would like to say
something about the speed of convergence of the $d$-tuple of partial sums
$(S_n^{(1)}, \ldots, S_n^{(d)})$ to the limiting semicircular family
$(s_1, \ldots, s_d)$, there is no nice replacement for the distribution
function, and we also do not know of a canonical metric on joint
distributions of several non-commuting variables which relates directly
to the above estimates for operator-valued Cauchy transforms.

However, there is a kind of replacement for this; namely, following
again [5], estimates for Cauchy transforms of linear combinations, with
operator-valued coefficients, of the variables
$(S_n^{(1)}, \ldots, S_n^{(d)})$ should imply corresponding estimates for
any non-commutative scalar polynomial in those variables, and from those
one should be able to estimate, for any selfadjoint non-commutative
polynomial $p$, the Lévy distance between
$p(S_n^{(1)}, \ldots, S_n^{(d)})$ and $p(s_1, \ldots, s_d)$. However, one
has to deal with the following problem in such an approach: as is shown
in [5], one can get the Cauchy transform of a polynomial
$p(s_1, \ldots, s_d)$ as a corner of an operator-valued Cauchy transform
of a linear combination $P$, with matrix-valued coefficients, of
$s_1, \ldots, s_d$; but, even if $p$ is a selfadjoint polynomial, the
corresponding matrix-valued operator $P$ is not selfadjoint, and thus our
operator-valued estimates, which were only shown for selfadjoint $X$,
cannot be used directly for $P$; one would have to reprove most of our
statements also for $P$. It is conceivable that this can be done in a
similar manner as in [5]; as this approach is getting quite technical, we
will pursue the details in a forthcoming investigation.

Note that for proving such a kind of Berry-Esseen theorem for polynomials
$p(s_1, \ldots, s_d)$ one also has to face another kind of question:
estimates for the difference of Cauchy transforms translate directly only
into estimates for the Lévy distance between the corresponding measures;
in order to get also estimates for the more intuitive Kolmogorov distance
one needs to know that the distribution of $p(s_1, \ldots, s_d)$ has a
continuous density, and in particular has no atoms. We conjecture that
this is true for all non-commutative selfadjoint polynomials $p$ in a
semicircular family, but this seems to be a non-trivial problem. Note
that the question of absence of atoms can be seen as an analogue of the
Zero-Divisor Theorem for the free group. We hope to address this question
in some future work.

The paper is organized as follows. In the next section we will first
relate a multivariate free central limit theorem to a one-dimensional
operator-valued free central limit theorem. The proof of Theorem 1
will be given in Section 3.



2. Multivariate free central limit theorem 

2.1. Setting. Let $(x_i^{(1)})_{i=1}^d, (x_i^{(2)})_{i=1}^d, \ldots$ be
free and identically distributed sets of $d$ selfadjoint random variables
in some non-commutative probability space $(\mathcal{C}, \varphi)$, such
that the first moments vanish and the second moments are given by a
covariance matrix $\Sigma = (\sigma_{kl})_{k,l=1}^d$. We put

\[ S_n^{(k)} := \frac{x_k^{(1)} + \cdots + x_k^{(n)}}{\sqrt{n}}, \qquad k = 1, \ldots, d. \]

We know that $(S_n^{(1)}, \ldots, S_n^{(d)})$ converges in distribution
for $n \to \infty$ to a semicircular family $(s_1, \ldots, s_d)$ of
covariance $\Sigma$. We want to analyze the rate of this convergence. We
would like to get an estimate which involves only low-order moments of
the given variables. As we will see, the second and fourth moments of our
variables will show up in the estimates, and we will use the upper bound

\[ \beta_2 := \max_{k,l} |\sigma_{kl}| = \max_{k,l} |\varphi(x_k^{(i)} x_l^{(i)})| \]

for the second and the upper bound

\[ \beta_4 := \max_{r,p,k,l} |\varphi(x_r^{(i)} x_p^{(i)} x_k^{(i)} x_l^{(i)})| \]

for the fourth moments.


2.2. Transition to operator-valued frame. We will analyze the
rate of convergence of the multivariate problem,

\[ (S_n^{(1)}, \ldots, S_n^{(d)}) \to (s_1, \ldots, s_d), \]

by replacing it by a one-dimensional operator-valued problem. The
underlying idea is the linearization trick [6, 5]: one can understand the
joint distribution of several scalar random variables by understanding
the distribution of each operator-valued linear combination of those
random variables.

Let $\mathcal{B} = M_N(\mathbb{C})$ and put
$\mathcal{A} := M_N(\mathbb{C}) \otimes \mathcal{C} = M_N(\mathcal{C})$.
Then $\mathcal{B} \cong \mathcal{B} \otimes 1 \subset \mathcal{A}$ is an
operator-valued probability space with respect to the conditional
expectation

\[ E = \mathrm{id} \otimes \varphi : \mathcal{B} \otimes \mathcal{C} \to \mathcal{B}, \qquad b \otimes c \mapsto \varphi(c) b. \]

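For some fixed $b_1, \ldots, b_d \in M_N(\mathbb{C})$ we put

\[ X_i := \sum_{k=1}^d b_k \otimes x_k^{(i)} \]

and

\[ S_n := \sum_{k=1}^d b_k \otimes S_n^{(k)}. \]

Note that $X_1, X_2, \ldots$ are free with respect to $E$ and that we have

\[ S_n = \frac{X_1 + \cdots + X_n}{\sqrt{n}}. \]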



The limit of $S_n$ is

\[ s := \sum_{k=1}^d b_k \otimes s_k, \]

which is a $\mathcal{B} = M_N(\mathbb{C})$-valued semicircular element
with covariance mapping $\eta : \mathcal{B} \to \mathcal{B}$ given by

\[ \eta(b) = E[s \cdot b \otimes 1 \cdot s] = \sum_{k,l=1}^d E[b_k \otimes s_k \cdot b \otimes 1 \cdot b_l \otimes s_l] = \sum_{k,l=1}^d b_k b b_l \, \varphi(s_k s_l) = \sum_{k,l=1}^d b_k b b_l \, \sigma_{kl}. \]

We want to determine the rate of convergence of $S_n$ to $s$. We will
do this in the next section in the context of a general operator-valued
free central limit theorem.
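The covariance mapping $\eta(b) = \sum_{k,l} b_k b b_l \sigma_{kl}$ is an explicit finite-dimensional map and can be evaluated directly. The following is a minimal sketch, assuming matrices stored as nested lists; the helper names and the sample coefficients (two Pauli matrices with identity covariance) are hypothetical and serve only as an illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eta(b, coeffs, sigma):
    """eta(b) = sum over k, l of sigma[k][l] * b_k b b_l."""
    n = len(b)
    out = [[0j] * n for _ in range(n)]
    for k, bk in enumerate(coeffs):
        for l, bl in enumerate(coeffs):
            term = matmul(matmul(bk, b), bl)
            for i in range(n):
                for j in range(n):
                    out[i][j] += sigma[k][l] * term[i][j]
    return out

# Hypothetical data: N = 2, d = 2, uncorrelated semicirculars of variance 1.
b1 = [[0, 1], [1, 0]]
b2 = [[1, 0], [0, -1]]
sigma = [[1, 0], [0, 1]]
identity = [[1, 0], [0, 1]]
print(eta(identity, [b1, b2], sigma))  # eta(1) = b1^2 + b2^2
```

Note that $\eta$ is in general not a multiple of the identity map, which is why the operator-valued setting of the next section is needed.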

3. Rate of convergence for operator-valued free 
central limit theorem 

3.1. Setting. Let $1 \in \mathcal{B} \subset \mathcal{A}$,
$E : \mathcal{A} \to \mathcal{B}$ be an operator-valued probability space.
This means that $\mathcal{A}$ is a von Neumann algebra, $\mathcal{B}$ is
a von Neumann subalgebra which contains the identity of $\mathcal{A}$,
and $E$ is a conditional expectation from $\mathcal{A}$ onto
$\mathcal{B}$, i.e., a linear map which satisfies the property

\[ E[b_1 a b_2] = b_1 E[a] b_2 \]

for all $a \in \mathcal{A}$ and $b_1, b_2 \in \mathcal{B}$.

Consider selfadjoint $X_1, X_2, \ldots \in \mathcal{A}$ which are free
with respect to $E$ and have identical $\mathcal{B}$-valued distribution.
Assume that the first moments vanish,

\[ E[X_i] = 0, \]

and let

\[ \eta : \mathcal{B} \to \mathcal{B}, \qquad \eta(b) = E[X_i b X_i] \]

be their covariance. We will need

\[ \alpha_2 := \sup_{\|b\| = 1} \|E[X_i b X_i]\| = \|\eta\| \]

and

\[ \alpha_4 := \sup_{\|b\| = 1} \|E[X_i b X_i X_i b^* X_i]\|. \]

Consider now the normalized sums

\[ S_n := \frac{X_1 + \cdots + X_n}{\sqrt{n}}. \]

We know that $S_n$ converges in distribution to an operator-valued
semicircular element $s$ with covariance $\eta$, see [11].






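We want to estimate the rate of this convergence. Let us denote by
$\mathcal{B}^+$ the "upper half plane" in $\mathcal{B}$, i.e.,

\[ \mathcal{B}^+ := \{ b \in \mathcal{B} \mid \operatorname{Im} b \ge 0 \text{ and } \operatorname{Im} b \text{ invertible} \}. \]

We consider, for $b \in \mathcal{B}^+$, the resolvents

\[ R_n(b) := \frac{1}{b - S_n}, \qquad R(b) := \frac{1}{b - s} \]

and the Cauchy transforms

\[ G_n(b) := E[R_n(b)], \qquad G(b) := E[R(b)]. \]

$G_n$ and $G$ are analytic functions in $\mathcal{B}^+$.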

3.2. The main estimates. We will show that $G_n(b)$ converges to
$G(b)$, where we have good control over the difference in terms of $n$
and $b$. The idea for showing this is the same as in [6]. First we show
that $G_n$ satisfies an approximate version of an equation satisfied by
$G$, and then we show that this actually implies that $G_n$ and $G$ must
be close to each other.

Let us start with deriving the equations for $G$ and $G_n$.

Since $s$ is an operator-valued semicircular element with covariance
$\eta$, we know [13, 11] that its Cauchy transform satisfies the equation

\[ b G(b) - 1 = \eta(G(b)) \cdot G(b). \tag{2} \]
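In the scalar case $\mathcal{B} = \mathbb{C}$ with variance 1 one has $\eta(G) = G$, and equation (2) reduces to the quadratic equation $zG - 1 = G^2$ for the Cauchy transform of the standard semicircle law. The following is a minimal sketch of this special case; the function name is hypothetical.

```python
import cmath

def semicircle_cauchy(z: complex) -> complex:
    """Cauchy transform of the standard semicircle law at z with Im z > 0."""
    root = cmath.sqrt(z * z - 4)
    candidates = [(z - root) / 2, (z + root) / 2]
    # Both candidates solve G^2 - zG + 1 = 0 and their product is 1, so
    # exactly one of them lies in the lower half plane; a Cauchy transform
    # maps the upper half plane to the lower one, which singles it out.
    return min(candidates, key=lambda g: g.imag)

z = 0.3 + 1.0j
g = semicircle_cauchy(z)
residual = z * g - 1 - g * g   # equation (2) with eta(G) = G
print(g, abs(residual))
```

The selection of the root with negative imaginary part is the scalar shadow of the uniqueness statement that is used later in this section.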

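We want to derive an approximate version of this equation for $G_n$.
For this, we will look at $E[S_n R_n(b)]$.

Let us denote by $S_n^{[i]}$ the version of $S_n$ where the $i$-th
variable $X_i$ is absent, i.e.

\[ S_n^{[i]} := S_n - \frac{1}{\sqrt{n}} X_i, \]

and by $R_n^{[i]}$ and $G_n^{[i]}$ the corresponding resolvent and Cauchy
transform, respectively, i.e.,

\[ R_n^{[i]}(b) := \frac{1}{b - S_n^{[i]}} \]

and

\[ G_n^{[i]}(b) := E[R_n^{[i]}(b)]. \]

For each $i = 1, \ldots, n$ we have the resolvent identity

\[ R_n(b) = R_n^{[i]}(b) + \frac{1}{\sqrt{n}} R_n^{[i]}(b) \cdot X_i \cdot R_n^{[i]}(b) + \frac{1}{n} R_n(b) \cdot X_i \cdot R_n^{[i]}(b) \cdot X_i \cdot R_n^{[i]}(b). \]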
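The resolvent identity with the second-order remainder is purely algebraic: it follows by applying $R_n(b) = R_n^{[i]}(b) + \frac{1}{\sqrt{n}} R_n(b) X_i R_n^{[i]}(b)$ twice. It can be checked numerically, for instance with $2 \times 2$ matrices. The following is a minimal sketch; the helper names and the sample matrices are hypothetical.

```python
import math

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def resolvent(b, s):
    """(b - s)^{-1} for 2x2 matrices."""
    return inv(add(b, scale(-1, s)))

n = 3
# Hypothetical selfadjoint 2x2 matrices standing in for X_1, X_2, X_3.
xs = [[[1, 2], [2, -1]], [[0, 1j], [-1j, 3]], [[2, 1], [1, 0]]]
b = [[1j, 0], [0, 2j]]   # Im b is positive and invertible
i = 0                    # index of the variable that is removed

s_full = scale(1 / math.sqrt(n), add(*xs))
s_without = add(s_full, scale(-1 / math.sqrt(n), xs[i]))
r_full, r_i, x = resolvent(b, s_full), resolvent(b, s_without), xs[i]

rhs = add(r_i,
          scale(1 / math.sqrt(n), mul(mul(r_i, x), r_i)),
          scale(1 / n, mul(mul(mul(mul(r_full, x), r_i), x), r_i)))
error = max(abs(r_full[p][q] - rhs[p][q]) for p in range(2) for q in range(2))
print(error)  # of the order of rounding errors
```

Only the last term still contains the full resolvent $R_n(b)$; this is the term that will be treated as a remainder.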
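Now we can write

\[ E[S_n R_n(b)] = \sum_{i=1}^n E\Big[ \frac{X_i}{\sqrt{n}} \cdot R_n(b) \Big] = \sum_{i=1}^n \Big\{ \frac{1}{\sqrt{n}} E[X_i \cdot R_n^{[i]}(b)] + \frac{1}{n} E[X_i \cdot R_n^{[i]}(b) \cdot X_i \cdot R_n^{[i]}(b)] + \frac{1}{n\sqrt{n}} E[X_i \cdot R_n(b) \cdot X_i \cdot R_n^{[i]}(b) \cdot X_i \cdot R_n^{[i]}(b)] \Big\}. \]

Now we use our assumption that $X_1, X_2, \ldots$ are free with respect to
$E$, which implies that $X_i$ is free from $R_n^{[i]}(b)$ with respect to
$E$. This implies that

\[ E[X_i \cdot R_n^{[i]}(b)] = E[X_i] \cdot E[R_n^{[i]}(b)] = 0 \]

and

\[ E[X_i \cdot R_n^{[i]}(b) \cdot X_i \cdot R_n^{[i]}(b)] = E[X_i \cdot E[R_n^{[i]}(b)] \cdot X_i] \cdot E[R_n^{[i]}(b)] + E[X_i] \cdot E[R_n^{[i]}(b) \cdot E[X_i] \cdot R_n^{[i]}(b)] - E[X_i] \cdot E[R_n^{[i]}(b)] \cdot E[X_i] \cdot E[R_n^{[i]}(b)] = E[X_i \cdot E[R_n^{[i]}(b)] \cdot X_i] \cdot E[R_n^{[i]}(b)]. \]

So we have got finally

\[ E[S_n R_n(b)] = \frac{1}{n} \sum_{i=1}^n \Big( \eta(G_n^{[i]}(b)) \cdot G_n^{[i]}(b) + r_1^{[i]} \Big), \tag{3} \]

where

\[ r_1^{[i]} = \frac{1}{\sqrt{n}} E[X_i \cdot R_n(b) \cdot X_i \cdot R_n^{[i]}(b) \cdot X_i \cdot R_n^{[i]}(b)]. \]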

We will now estimate the norm of $r_1^{[i]}$. We could of course just
estimate against the operator norm of $X_i$; however, we prefer, in
analogy with the classical case, to do better without invoking the
operator norm and to use only moments of $X_i$ of as low an order as
possible.

Note that for our conditional expectation $E$ we have the Cauchy-Schwarz
inequality

\[ \|E[AB]\|^2 \le \|E[AA^*]\| \cdot \|E[B^*B]\|, \]

and also

\[ E[A]^* E[A] \le E[A^*A] \qquad \text{and} \qquad E[ABB^*A^*] \le \|BB^*\| \cdot E[AA^*] \]

and

\[ \|E[A]\| \le \|A\| \]

for any $A, B \in \mathcal{A}$. Thus, for any $i = 1, \ldots, n$, we can
estimate

\[ \|E[X_i R_n(b) X_i R_n^{[i]}(b) X_i R_n^{[i]}(b)]\|^2 \le \|E[X_i R_n(b) R_n(b)^* X_i]\| \cdot \|E[R_n^{[i]}(b)^* X_i R_n^{[i]}(b)^* X_i X_i R_n^{[i]}(b) X_i R_n^{[i]}(b)]\|. \]

We estimate the first factor by

\[ \|E[X_i R_n(b) R_n(b)^* X_i]\| \le \|R_n(b)\|^2 \cdot \|E[X_i X_i]\| = \|R_n(b)\|^2 \cdot \|\eta(1)\| \le \alpha_2 \|R_n(b)\|^2. \]

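For the second factor we use again the freeness between $X_i$ and
$R_n^{[i]}(b)$. Let us put

\[ R := R_n^{[i]}(b). \]

Then $X_i$ and $R$ are $*$-free with respect to $E$ and thus, by also
invoking $E[X_i] = 0$, we have

\[ E[R^* X_i R^* X_i X_i R X_i R] = E\big[ R^* \cdot E[X_i E[R^*] X_i X_i E[R] X_i] \cdot R \big] + E\big[ R^* \cdot \eta(E[R^* \eta(1) R]) \cdot R \big] - E\big[ R^* \cdot \eta(E[R^*] \eta(1) E[R]) \cdot R \big] \]

and thus

\[ \|E[R^* X_i R^* X_i X_i R X_i R]\| \le \big\| E\big[ R^* \cdot E[X_i E[R^*] X_i X_i E[R] X_i] \cdot R \big] \big\| + \big\| E\big[ R^* \cdot \eta(E[R^* \eta(1) R]) \cdot R \big] \big\| + \big\| E\big[ R^* \cdot \eta(E[R^*] \eta(1) E[R]) \cdot R \big] \big\|. \]

We estimate

\[ \big\| E\big[ R^* \cdot E[X_i E[R^*] X_i X_i E[R] X_i] \cdot R \big] \big\| \le \|R\| \cdot \|R^*\| \cdot \|E[X_i E[R^*] X_i X_i E[R] X_i]\| \le \|R\|^2 \cdot \alpha_4 \cdot \|E[R]\| \cdot \|E[R^*]\| \le \alpha_4 \cdot \|R\|^4, \]

\[ \big\| E\big[ R^* \cdot \eta(E[R^* \eta(1) R]) \cdot R \big] \big\| \le \alpha_2^2 \cdot \|R\|^4, \]

and

\[ \big\| E\big[ R^* \cdot \eta(E[R^*] \eta(1) E[R]) \cdot R \big] \big\| \le \alpha_2^2 \cdot \|R\|^4. \]

Putting this together yields

\[ \|E[R_n^{[i]}(b)^* X_i R_n^{[i]}(b)^* X_i X_i R_n^{[i]}(b) X_i R_n^{[i]}(b)]\| \le (\alpha_4 + 2\alpha_2^2) \cdot \|R_n^{[i]}(b)\|^4 \]

and finally

\[ \|r_1^{[i]}\| \le \frac{1}{\sqrt{n}} \cdot \sqrt{\alpha_2 (\alpha_4 + 2\alpha_2^2)} \cdot \|R_n(b)\| \cdot \|R_n^{[i]}(b)\|^2. \]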

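We still need to replace, in (3), $G_n^{[i]}(b) = E[R_n^{[i]}(b)]$ by
$G_n(b) = E[R_n(b)]$. By using the resolvent identity

\[ R_n(b) = R_n^{[i]}(b) + \frac{1}{\sqrt{n}} R_n^{[i]}(b) \cdot X_i \cdot R_n(b) \]

we have

\[ G_n^{[i]}(b) = G_n(b) + r_2^{[i]}, \]

where

\[ r_2^{[i]} = -\frac{1}{\sqrt{n}} E[R_n^{[i]}(b) X_i R_n(b)]. \]

As before, we estimate

\[ \|E[R_n^{[i]}(b) X_i R_n(b)]\|^2 \le \|E[R_n^{[i]}(b) X_i X_i R_n^{[i]}(b)^*]\| \cdot \|E[R_n(b)^* R_n(b)]\| \le \alpha_2 \cdot \|R_n^{[i]}(b)\|^2 \cdot \|R_n(b)\|^2. \]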
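Let us summarize. We have

\[ E[S_n R_n(b)] = \frac{1}{n} \sum_{i=1}^n \Big( \eta(G_n^{[i]}(b)) \cdot G_n^{[i]}(b) + r_1^{[i]} \Big), \qquad G_n^{[i]}(b) = G_n(b) + r_2^{[i]}, \]

and the estimates

\[ \|r_1^{[i]}\| \le \frac{1}{\sqrt{n}} \cdot \sqrt{\alpha_2 (\alpha_4 + 2\alpha_2^2)} \cdot \|R_n(b)\| \cdot \|R_n^{[i]}(b)\|^2 \]

and

\[ \|r_2^{[i]}\| \le \frac{\sqrt{\alpha_2}}{\sqrt{n}} \cdot \|R_n^{[i]}(b)\| \cdot \|R_n(b)\|. \]

It remains to estimate $\|R_n(b)\|$ and $\|R_n^{[i]}(b)\|$. For those we
use the usual estimate for Cauchy transforms (where
$\operatorname{Im} b := (b - b^*)/(2i)$ denotes the imaginary part of
$b$),

\[ \|R_n(b)\| \le \Big\| \frac{1}{\operatorname{Im} b} \Big\|, \qquad \|R_n^{[i]}(b)\| \le \Big\| \frac{1}{\operatorname{Im} b} \Big\|. \]

For a formal proof of this estimate, see, e.g., Lemma 3.1 in [6].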
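We have now

\[ E[S_n R_n(b)] = \eta(G_n(b)) \cdot G_n(b) + r_3, \]

where

\[ r_3 = \frac{1}{n} \sum_{i=1}^n \Big( \eta(G_n(b)) \cdot r_2^{[i]} + \eta(r_2^{[i]}) \cdot G_n(b) + \eta(r_2^{[i]}) \cdot r_2^{[i]} + r_1^{[i]} \Big). \]

Hence

\[ \|r_3\| \le \frac{1}{n} \sum_{i=1}^n \Big( 2 \|\eta\| \cdot \|G_n(b)\| \cdot \|r_2^{[i]}\| + \|\eta\| \cdot \|r_2^{[i]}\|^2 + \|r_1^{[i]}\| \Big) \le c_n, \]

where

\[ c_n := c_n(b) := \frac{1}{\sqrt{n}} \Big\| \frac{1}{\operatorname{Im} b} \Big\|^3 \sqrt{\alpha_2} \cdot \Big( 2\alpha_2 + \sqrt{\alpha_4 + 2\alpha_2^2} \Big) + \frac{1}{n} \Big\| \frac{1}{\operatorname{Im} b} \Big\|^4 \alpha_2^2. \]

Note that $S_n R_n(b) = -1 + b R_n(b)$, hence

\[ E[S_n R_n(b)] = b G_n(b) - 1, \]

and so we finally have found

\[ \eta(G_n(b)) \cdot G_n(b) - b G_n(b) + 1 = -r_3, \tag{4} \]

or the inequality:

\[ \|\eta(G_n(b)) \cdot G_n(b) - b G_n(b) + 1\| \le c_n. \tag{5} \]

In order to get from this an estimate for the difference between $G_n(b)$
and $G(b)$, we will now follow the ideas in Section 5 of [6], in the
improved version from [5].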

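By (2), we have for all $b \in \mathcal{B}^+$ the equation

\[ b = \frac{1}{G(b)} + \eta(G(b)) \tag{6} \]

for $G(b)$, and, by (4), the corresponding approximate version for
$G_n(b)$:

\[ \Lambda_n(b) = \frac{1}{G_n(b)} + \eta(G_n(b)), \tag{7} \]

where

\[ \Lambda_n(b) := b - r_3 \cdot G_n(b)^{-1}. \]

A crucial point is now to show that for a sufficiently large set
$\Theta_n \subset \mathcal{B}^+$ the quantity
$\operatorname{Im} \Lambda_n(b)$ is still positive, so that we can also
use equation (6) for $\Lambda_n(b)$. Let us try

\[ \Theta_n := \Big\{ b \in \mathcal{B}^+ \ \Big|\ c_n(b) < 1/2 \ \text{ and } \ c_n(b) \cdot \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big) \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| < 1/2 \Big\}. \]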
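The relevance of the condition $c_n(b) < 1/2$ is the following: Let us
denote

\[ B_n(b) := b - \eta(G_n(b)), \]

then inequality (5) takes, for $b \in \Theta_n$, the form

\[ \|1 - B_n(b) G_n(b)\| \le c_n(b) < 1/2. \]

This, however, implies that $B_n(b) G_n(b)$ is invertible with

\[ \|G_n(b)^{-1} B_n(b)^{-1}\| = \|(B_n(b) G_n(b))^{-1}\| < 2, \]

and thus

\[ \|G_n(b)^{-1}\| = \|G_n(b)^{-1} B_n(b)^{-1} B_n(b)\| \le 2 \|B_n(b)\| = 2 \|b - \eta(G_n(b))\| \le 2 \big( \|b\| + \alpha_2 \cdot \|G_n(b)\| \big) \le 2 \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big). \]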

But then the other condition in the definition of $\Theta_n$ implies that
for $b \in \Theta_n$ we have

\[ \|r_3 \cdot G_n(b)^{-1}\| \le \|r_3\| \cdot \|G_n(b)^{-1}\| \le c_n \cdot 2 \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big) < \Big\| \frac{1}{\operatorname{Im} b} \Big\|^{-1}. \tag{8} \]

Since

\[ \operatorname{Im} b \ge \Big\| \frac{1}{\operatorname{Im} b} \Big\|^{-1} \cdot 1, \]

it follows that, for $b \in \Theta_n$,
$\Lambda_n(b) = b - r_3 \cdot G_n(b)^{-1}$ is still in $\mathcal{B}^+$,
and so we can use equation (6) with $\Lambda_n(b)$ as argument, i.e.,

\[ \Lambda_n(b) = \frac{1}{G(\Lambda_n(b))} + \eta(G(\Lambda_n(b))). \tag{9} \]

The point of having both equation (9) and equation (7) is that this
implies that

\[ G(\Lambda_n(b)) = G_n(b). \]

In [6, 5] this was shown by analytic continuation arguments. We can
simplify that argument by using the fact from [7] that the equation

\[ w = \frac{1}{G} + \eta(G) \tag{10} \]

has, for any $w$ with $\operatorname{Im} w > 0$, exactly one solution
$G \in \mathcal{B}$ such that $\operatorname{Im} G$ is negative. Since
both $G_n(b)$ and $G(\Lambda_n(b))$ have negative imaginary parts (as
Cauchy transforms at some arguments) and both satisfy the same equation
(10) (for $w = \Lambda_n(b)$), they must agree.
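In the scalar case with $\eta(G) = G$, equation (10) reads $G = 1/(w - G)$, and the solution with negative imaginary part can be found by simply iterating this map. The following is a minimal sketch under that assumption; the function name, the starting point and the number of steps are hypothetical choices, and the iteration is meant only as an illustration, not as the argument of [7].

```python
import cmath

def solve_by_iteration(w: complex, steps: int = 200) -> complex:
    """Iterate G -> 1/(w - G), the scalar form of (10) with eta(G) = G."""
    g = -1j                 # any starting point with Im g < 0
    for _ in range(steps):
        # If Im g < 0 and Im w > 0, then Im (w - g) > 0, so the new
        # iterate 1/(w - g) has negative imaginary part again.
        g = 1 / (w - g)
    return g

w = 0.3 + 1.0j
g_iter = solve_by_iteration(w)

# Closed-form comparison: the root of G^2 - wG + 1 = 0 with Im G < 0.
root = cmath.sqrt(w * w - 4)
g_closed = min([(w - root) / 2, (w + root) / 2], key=lambda g: g.imag)
print(abs(g_iter - g_closed))
```

The iteration never leaves the lower half plane, so the limit it finds is the one solution singled out by the uniqueness statement.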

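Then we can, still in the case $b \in \Theta_n$, estimate in the usual
way, by invoking the resolvent identity:

\[ \|G_n(b) - G(b)\| = \|G(\Lambda_n(b)) - G(b)\| = \|G(\Lambda_n(b)) \cdot (\Lambda_n(b) - b) \cdot G(b)\| \le \|\Lambda_n(b) - b\| \cdot \|G_n(b)\| \cdot \|G(b)\|. \]

Both $\|G(b)\|$ and $\|G_n(b)\|$ can be estimated by
$\|1/\operatorname{Im} b\|$, and for the first factor we have, by the
second inequality in (8), that

\[ \|\Lambda_n(b) - b\| = \|r_3 G_n(b)^{-1}\| \le c_n \cdot 2 \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big). \]

Thus, for $b \in \Theta_n$, we have shown that

\[ \|G_n(b) - G(b)\| \le c_n \cdot 2 \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big) \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\|^2. \tag{11} \]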
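For $b \in \mathcal{B}^+ \setminus \Theta_n$, on the other hand, we just
use the trivial estimate

\[ \|G_n(b) - G(b)\| \le 2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \]

together with

• if we have $c_n(b) \ge 1/2$, then

\[ \Big\| \frac{1}{\operatorname{Im} b} \Big\| \le 2 c_n \Big\| \frac{1}{\operatorname{Im} b} \Big\| \le 2 c_n \Big\| \frac{1}{\operatorname{Im} b} \Big\| \cdot \|\operatorname{Im} b\| \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \le 2 c_n \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\|^2 \cdot \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big); \]

• if we have
$c_n(b) \cdot (\|b\| + \alpha_2 \cdot \|1/\operatorname{Im} b\|) \cdot \|1/\operatorname{Im} b\| \ge 1/2$,
then we have again

\[ \Big\| \frac{1}{\operatorname{Im} b} \Big\| \le 2 c_n \cdot \Big( \|b\| + \alpha_2 \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\| \Big) \cdot \Big\| \frac{1}{\operatorname{Im} b} \Big\|^2. \]

Thus we have proved the Theorem.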